(54) Method and apparatus for generating runlength-limited coding with DC control



(57) A sequence of input blocks (310), each comprising p bits, is converted by an encoder (300) into a
sequence of codewords (314), each comprising q bits, according to a lossless coding scheme that maps
unconstrained binary sequences into sequences that obey the (d,k)-RLL constraint while offering a degree of DC
control. A single "overlapping" table is used for all states rather than using multiple tables. Recognizing that a
subset of codewords in a first state x_i are identical to a subset of codewords in the second state x_j, the
overlapping encoding table uses identical addresses for the subset of identical codewords in the first and second
state. Thus addresses for more than one state may point to a single codeword. A number of input bytes can be
encoded into two different codewords which have different parity of ones, thus allowing for DC control.
Decoding is carried out in a state-independent manner.
Description

This invention relates to a system for encoding binary data words into codewords that satisfy prescribed constraints
for transmission and thereafter for decoding the codewords into the original binary data words. In particular, this
invention relates to a system of encoding and decoding data which increases information density, minimizes the overall
DC component of the transmitted digital code, and minimizes the memory required for the coding table.

In digital transmission systems and in magnetic and optical recording/playback systems, the information to be
transmitted or to be recorded is presented as a bit stream sequence of ones and zeros. In optical and magnetic recording
systems, the bit stream written into the device must satisfy certain constraints. A common family of constraints are the
(d,k) runlength-limited (RLL) constraints, which specify that the run of zeros between consecutive ones in the bit stream
must have a length of at least d and a length of no more than k for the prescribed parameters d and k. Currently, it is
common for a compact disk to use a code with the constraint (d,k) = (2,10). An example of a sequence satisfying the
(2,10) constraint is ...00010000000000100100000100..., in which the first four runlengths are 3, 10, 2 and 5. Magnetic
recording standards include the (1,7)-RLL constraint and the (1,3)-RLL constraint.
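
The runlength definition above can be checked mechanically. The following Python sketch is illustrative only; the helper names `zero_runs` and `satisfies_rll` are assumptions introduced here and do not appear in the specification. It treats the first and last runs of a finite excerpt as partial runs, which only need to respect the upper bound k.

```python
def zero_runs(bits: str) -> list[int]:
    """Lengths of the runs of zeros delimited by ones (leading and trailing runs included)."""
    return [len(run) for run in bits.split("1")]


def satisfies_rll(bits: str, d: int, k: int) -> bool:
    """Check a finite excerpt against the (d,k)-RLL constraint.

    Interior runs (between two ones) must lie in [d, k]. The first and last
    runs may be cut off by the excerpt boundary, so only the bound k applies.
    """
    runs = zero_runs(bits)
    interior = runs[1:-1]
    return (
        all(d <= r <= k for r in interior)
        and runs[0] <= k
        and runs[-1] <= k
    )


# The example sequence from the text, built piecewise to keep the runs visible.
example = "0001" + "0" * 10 + "1001" + "00000" + "100"
print(zero_runs(example)[:4])        # first four runlengths
print(satisfies_rll(example, 2, 10))
```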

The set of all sequences satisfying a given (d,k)-RLL constraint can be described by reading the labels off paths
in the labeled directed graph as shown in Figure 1. The parameter k is imposed to guarantee sufficient sign changes
in the recorded waveform which are required for clock synchronization during read-back. The parameter d is required
to prevent inter-symbol interference.
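
A graph description of this kind also gives the capacity of the constraint, which bounds the rate p:q that any lossless code can reach. The sketch below is an illustration under stated assumptions: Figure 1 is not reproduced in this text, so the graph used here is the usual one with a state for each count of zeros emitted since the last one, and the function names are hypothetical. The capacity is the base-2 logarithm of the largest eigenvalue of the adjacency matrix, estimated here by plain power iteration.

```python
import math


def rll_adjacency(d: int, k: int) -> list[list[int]]:
    """Adjacency matrix of a (d,k)-RLL graph; state i = zeros seen since the last one."""
    n = k + 1
    a = [[0] * n for _ in range(n)]
    for i in range(n):
        if i < k:
            a[i][i + 1] = 1  # emitting a 0 lengthens the current run
        if i >= d:
            a[i][0] = 1      # emitting a 1 is allowed once the run has length >= d
    return a


def capacity(d: int, k: int, iterations: int = 5000) -> float:
    """Estimate log2 of the largest eigenvalue by power iteration."""
    a = rll_adjacency(d, k)
    n = len(a)
    v = [1.0] * n
    growth = 1.0
    for _ in range(iterations):
        w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
        growth = max(w)              # converges to the largest eigenvalue
        v = [x / growth for x in w]  # renormalize to avoid overflow
    return math.log2(growth)


print(round(capacity(2, 10), 4), round(capacity(2, 12), 4))
```

Under these assumptions the (2,10) value comes out at roughly 0.54, so a rate of 8:16 = 0.5 is attainable in principle, and the (2,12) value leaves room for the rate 8:15 discussed later.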

Another type of constraint requires controlling the low frequency or DC content of the input data stream. The DC
control is used in optical recording to avoid problems such as interference with the servo system and to allow filtering
of noise resulting from finger-prints. Information channels are not normally responsive to direct current and any DC
component of the transmitted or recorded signal is likely to be lost. Thus, the DC component of the sequence of symbols
should be kept as close to zero as possible, preferably at zero. This can be achieved by requiring the existence of a
positive integer B such that any recorded sequence $w_1 w_2 \ldots w_\ell$, now regarded over the symbol alphabet
{+1,-1}, will satisfy the inequality

$$\left| \sum_{h=i}^{j} w_h \right| \le B$$

for every $1 \le i \le j \le \ell$. Sequences that obey these conditions are said to satisfy the B-charge constraint. The
larger the value of B, the less reduction there will be in the DC component.

However, in certain applications, the charge constraint can be relaxed, thus allowing higher coding rates. In such
applications, the DC control may be achieved by using a coding scheme that allows a certain percentage of symbols
(on the average) to reverse the polarity of subsequent symbols. Alternatively, DC control may be achieved by allowing
a certain percentage of symbols on average to have alternate codewords with a DC component which is lower or of
opposite polarity.

DC control and (d,k)-RLL constraints can be combined. In such schemes, the constraint consists of binary sequences
$z_1 z_2 \ldots z_\ell$ that satisfy the (d,k)-RLL constraint, such that the respective NRZI sequences
$w_1 w_2 \ldots w_\ell$, with $w_i = (-1)^{z_1 + z_2 + \cdots + z_i}$,
have a controlled DC component.
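
To make the link between the two constraints concrete, the following Python sketch converts a binary sequence to its NRZI form (a one toggles the level, a zero holds it) and measures the smallest B for which the result satisfies the B-charge constraint. The function names and the starting level of +1 are assumptions made for illustration.

```python
def nrzi(bits: str, start: int = 1) -> list[int]:
    """Map a 0/1 string to +1/-1 symbols: each 1 flips the level, each 0 keeps it."""
    level, out = start, []
    for b in bits:
        if b == "1":
            level = -level
        out.append(level)
    return out


def max_charge(symbols: list[int]) -> int:
    """Largest |w_i + ... + w_j| over all windows.

    Every window sum is a difference of two prefix sums (with the empty
    prefix equal to 0), so the maximum is max(prefix) - min(prefix).
    """
    total = highest = lowest = 0
    for s in symbols:
        total += s
        highest = max(highest, total)
        lowest = min(lowest, total)
    return highest - lowest


w = nrzi("0100")
print(w, max_charge(w))
```

A sequence satisfies the B-charge constraint exactly when `max_charge` returns a value no larger than B.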

Figure 2 shows a functional block diagram of a conventional encoding/decoding system 200. In a
typical example of audio data recorded onto a CD, analog audio data from the left and right speakers 202a, 202b of a
stereo system are converted into an 8 bit signal which is input into a data scrambler and error correction code generator
whose output 210 is transmitted into an encoder 212 comprised of a channel encoder 214 and a parallel-to-serial
converter 216. The serial data 220 is written to a compact disk 222. A similar process is used to decode data from the
CD. Data 224 from the CD is input into a decoder 230 comprised of a serial to parallel converter 234 and a channel
decoder 232. Data from the CD is decoded, input into an error corrector and descrambler 236 and output as audio
data 240.

The encoder 212 is a uniquely-decodable (or lossless) mapping of an unconstrained data stream into a constrained
sequence. The current standard for encoding compact disk data is eight-to-fourteen modulation (EFM). Using EFM
encoding, blocks of 8 data bits are translated into blocks of 14 data bits, known as channel bits. EFM uses a lookup



EP0771078A2 

table which assigns an unambiguous codeword having a length of 14 bits to each 8-bit data word. By choosing the
right 14-bit words, bit patterns that satisfy the (2,10) constraint, high data density can be achieved. Three additional
bits called merge bits are inserted between the 14-bit codewords. These three bits are selected to ensure the (2,10)
constraint is maintained and also to control the low frequency or DC content of the bit stream. The addition of these
three merge bits makes the effective rate of this coding scheme 8:17 (not 8:14).

Demands for higher data density are increasing with the advent of multimedia, graphics-intensive computer
applications and high-quality digital video programming. A proposal described in the article "EFMPlus: The Coding Format
of the Multimedia Compact Disc", Proc. 16th Symp. on Inform. Theory in the Benelux, Nieuwerkerk a/d Yssel, May
18-19, 1995, describes an encoding/decoding system which increases data density compared to EFM coding. In the
system proposed in the EFMPlus article, both the encoder and decoder for constrained data take the form of a
finite-state machine. A rate p:q finite-state encoder accepts an input block of p bits and generates a q-bit codeword depending
on the input block and the current state of the encoder. The sequences obtained by concatenating the generated
q-bit codewords satisfy the constraint. In optical storage devices, the p-bit input block is typically taken to be an 8 bit byte
so that it matches the unit size used in the error-correction scheme.

The proposed EFMPlus scheme is a rate 8:16 finite state encoder for the (2,10)-RLL constraint which increases
its data density compared to the EFM scheme. The encoder is, however, a more complex four state encoder with each
state requiring 256 + 88 sixteen bit codewords. (The 88 codewords are alternate codewords which are used to control
the DC content.)

A method and apparatus of encoding and decoding binary data which increases information density, minimizes
the overall DC component of the transmitted digital code, and minimizes the memory required for the encoding and
decoding tables is needed.

The present invention describes a lossless coding scheme that maps unconstrained binary sequences into
sequences that obey the (d,k)-RLL constraint while offering a degree of DC control. The lossless coding scheme provides
a method and apparatus for encoding and decoding binary data which increases information density relative to EFM
coding and minimizes the overall DC component of the output constrained sequences. Further, the coding scheme
attempts to minimize the memory required for the encoding and decoding tables. Memory size is decreased compared
to the EFM coding scheme.

According to a first aspect of the present invention, there is provided a method of encoding a sequence of input
blocks, each block comprised of p bits, into a sequence of codewords, each codeword comprised of q bits, the sequence
of codewords satisfying a d-constraint such that consecutive bits of one type characterized by a transition are separated
by at least d bits of another type characterized by an absence of a transition, and satisfying a k-constraint such that
no more than a maximum of k bits of the other type occur between successive bits of the one type, comprising the
steps of: receiving a p bit input block; and converting each input block into a corresponding codeword of q serial bits
using an encoder, the encoder including an encoding table, the encoding table being representative of a number of
states x_1, x_2, ..., x_n, wherein at least a subset of the codewords in a first state are identical to a subset of codewords in
a second state, and further wherein the addresses for the subset of codewords in the first state are identical to the
addresses for the subset of codewords in the second state.

In a preferred method the address of the codeword in the encoding table corresponding to the input block is
determined by appending a prefix having a fixed number of bits to the input block.

The prefix may have more than one possible value, resulting in more than one possible codeword value. Preferably,
the value of the prefix appended to the input block is determined based on a comparison of the value of the input block
to a first threshold value and to a second threshold value.

In addition, the value of the first threshold value and the value of the second threshold value may depend on the
current state of the encoder.

The more than one possible codeword values may end in a final run that leads to the same state, may have different
parities, or may have different values of the DC component.

The method may include the further step of defining the codewords of the encoding table, wherein the codewords
are defined prior to the step of converting each input block into a corresponding codeword.

Defining the codewords may include the steps of:

determining the adjacency matrix A_G;
computing an approximate eigenvector of the adjacency matrix A_G;
deleting states with zero weight;
merging at least a subset of the states having the same weight;
responsive to the deleted states and merged states, determining a new adjacency matrix A_H; and
reducing the number of encoder states.

The method may include the further step of reordering the codewords to satisfy DC control conditions. This step
may include the further steps of:

maximizing the number of addresses a and a + 2^p containing codewords that lead to the same next state; and
maximizing the number of addresses a and a + 2^p where the codewords have different parities.

The step of reducing the number of encoder states may include the step of defining the order on the outgoing
edges for each state in the adjacency matrix A_H.

The step of reducing the number of encoder states may further include the steps of:

establishing all the words that satisfy the (d,k) constraint;
deleting edges that were deleted during formation of graph H; and
deleting words ending in runlength greater than the edges that were deleted during formation of the graph H.

In one embodiment, the step of reordering the codewords to satisfy the DC control conditions further includes the
steps of:

maximizing the number of addresses a and a + 2^p containing codewords that lead to the same next state; and
maximizing the number of addresses a and a + 2^p where the codewords have different values of the DC component.

According to a second aspect of the present invention, there is provided a method of encoding and decoding a
sequence of input blocks, each block comprised of p bits, into a sequence of codewords, each codeword comprised of
q bits, the sequence of codewords satisfying a d-constraint such that consecutive bits of one type characterized by a
transition are separated by at least d bits of another type characterized by an absence of a transition, and satisfying
a k-constraint such that no more than a maximum of k bits of the other type occur between successive bits of the one
type, comprising the steps of: receiving a p bit input block; converting each input block into a corresponding codeword
of q serial bits using an encoder, the encoder including an encoding table, the encoding table being representative of
a number of states x_1, x_2, ..., x_n, wherein at least a subset of the codewords in a first state are identical to a subset of
codewords in a second state, and further wherein the addresses for the subset of codewords in the first state are
identical to the addresses for the subset of codewords in the second state; and decoding each codeword.

According to a third aspect of the present invention, there is provided a coding system for encoding and decoding
a sequence of input blocks, each block comprised of p bits, into a sequence of codewords, each codeword comprised
of q bits, the sequence of codewords satisfying a d-constraint such that consecutive bits of one type characterized by
a transition are separated by at least d bits of another type characterized by an absence of a transition, and satisfying
a k-constraint such that no more than a maximum of k bits of the other type occur between successive bits of the one
type, the system comprised of:

a converting means for converting each input block into a corresponding codeword of q serial bits using an encoder,
the encoder including an encoder table, the encoding table being representative of a number of states x_1, x_2, ...,
x_n, wherein at least a subset of the codewords in a first state are identical to a subset of codewords in a second
state, and further wherein the addresses for the subset of codewords in the first state are identical to the addresses
for the subset of codewords in the second state; and
a decoding means for decoding each codeword.

In the preferred embodiment, the channel encoder is a state machine which uses a single "overlapping" table for
all states rather than using multiple tables. Recognizing that a subset of codewords in a first state x_i are identical to a
subset of codewords in the second state x_j, the overlapping encoding table uses identical addresses for the subset of
identical codewords in the first and second state. Thus addresses for more than one state may point to a single
codeword. A number of input bytes can be encoded into two different codewords which have different parity of ones, thus
allowing for DC control. Decoding is carried out in a state-independent manner.

The encoder is a finite-state machine that maps input blocks to codewords. The encoder design is based on a
method of choosing codewords and their sequence using state splitting, state merging and state deletion techniques
such that a single table may be constructed for mapping unconstrained binary sequences into sequences that obey a
(d,k) runlength constraint (herewith d=2 and k=10 or 12) and a fixed rate (either 8:15 or 8:16). The encoder is a
finite-state machine consisting of four or more states. The encoder can achieve DC offset control by choosing between
output codewords with opposite "parity."

The main building block of the encoder is a table of codewords that serves all states. It has a simple addressing
scheme for selection of a codeword or its opposite parity codeword, which simplifies the address circuitry. Encoding
is carried out by prefixing the input block with a fixed number of bits (two bits in the provided examples) which depend
on the input block as well as the current encoder state. The result is an address to the table from which the current
encoded codeword is taken. Assuming random input, the probability of being at any given state is independent of the
previous state, which allows advantage to be taken of the statistical randomness of the data.

The encoder features DC control by allowing for a number of input blocks to have two possible encoded codewords.
The parity (number of 1's) is different in the two possible codewords and so the respective NRZI sequences end with
a different polarity, thus allowing the reversal of the polarity of subsequent codewords. The ability to replace codewords
with codewords of opposite polarity allows control of the accumulating DC offset. Since the "final bits" or "final run" of
all codewords and their opposite parity codewords are matched, subsequent encoding is not affected by which is
chosen (a codeword or its opposite parity mate). This facilitates using "look ahead" to optimize DC control. In those
cases where DC control is possible, the address of the alternate codeword is obtained by adding a fixed number (2^8
in the provided examples) to the computed address.
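
One simple way to use the alternate address is a greedy rule that picks whichever candidate keeps the accumulated DC offset closest to zero. The Python sketch below illustrates that idea only; it does not implement the "look ahead" mentioned above, and the function names, the toy table and its addresses (70 and 70 + 256) are hypothetical. The two 16-bit words are the pair quoted later in the description for state S1 and input 70.

```python
def codeword_dsv(codeword: str, level: int) -> tuple[int, int]:
    """Sum of the NRZI symbols of one codeword and the level it ends on."""
    total = 0
    for b in codeword:
        if b == "1":
            level = -level  # a one is a transition
        total += level
    return total, level


def pick_codeword(table: dict[int, str], address: int, has_alternate: bool,
                  rds: int, level: int, offset: int = 256) -> tuple[int, int, int]:
    """Greedy choice between `address` and `address + offset`.

    Returns (chosen address, new running digital sum, new NRZI level).
    """
    candidates = [address] + ([address + offset] if has_alternate else [])
    best = None
    for a in candidates:
        dsv, end_level = codeword_dsv(table[a], level)
        key = (abs(rds + dsv), a, rds + dsv, end_level)
        if best is None or key < best:
            best = key
    _, chosen, new_rds, new_level = best
    return chosen, new_rds, new_level


# Hypothetical table fragment: addresses are assumptions for illustration.
toy_table = {70: "0000100000010001", 70 + 256: "0100100000010001"}
print(pick_codeword(toy_table, 70, True, rds=0, level=1))
print(pick_codeword(toy_table, 70, True, rds=-3, level=1))
```

With a balanced running sum the first candidate is kept; with a running sum of -3 the alternate at the higher address brings the sum closer to zero.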

The encoder preferably has a sliding-block decoder which can be efficiently implemented using an associative
memory which contains the encoder table. The current input block is recovered from the current codeword and, possibly,
a prescribed number of subsequent codewords. In the first example, the input byte can be fully recovered from the
address in the table where the current received codeword is located. In the second example, the most significant seven
bits of the current input byte can be obtained in a similar manner. As for the least significant bit (LSB), it may be
computed from the address as well, or, depending on the current codeword, by determining which half of the table the
next codeword is located in.

Error propagation can be limited by the use of sliding-block decoders. A sliding-block decoder makes a decision
on a given received q-bit codeword on the basis of the local context of that codeword in the received sequence: the
codeword itself, as well as a fixed number m of preceding codewords and a fixed number a of later codewords. Thus,
a single error at the input to a sliding-block decoder can only affect the decoding of at most m+a+1 consecutive codewords.
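
The m+a+1 bound follows from counting the decoding windows that contain the corrupted codeword: the decision for position t looks at codewords t-m through t+a, so an error at position e is seen by the decisions for positions e-a through e+m. A small Python sketch of that count (the function name is an assumption):

```python
def affected_outputs(error_index: int, m: int, a: int, total: int) -> range:
    """Indices of decoded blocks whose window (t-m .. t+a) contains the corrupted codeword."""
    first = max(0, error_index - a)
    last = min(total - 1, error_index + m)
    return range(first, last + 1)


# A decoder that looks at the current codeword and one later codeword: m = 0, a = 1.
print(list(affected_outputs(10, m=0, a=1, total=100)))
```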

A further understanding of the nature and advantages of the present invention may be realized with reference to
the remaining portions of the specification and the attached drawings.

Figure 1 shows a graphical representation of a (d,k)-RLL constraint.

Figure 2 shows a functional block diagram of a conventional encoding/decoding system.

Figure 3 shows a schematic diagram of a (2,10)-RLL encoder according to the present invention.

Figure 4 shows an encoding table for a (2,10)-RLL encoder according to the present invention.

Figure 5 shows the format of the address to the encoding table.

Figure 6 shows a table of threshold values and the table of prefixes for a (2,10)-RLL encoder.

Figure 7 shows an encoding table for a (2,12)-RLL encoder according to the present invention.

Figure 8 shows the runlength next state dependency for a (2,10)-RLL Level-4 encoder according to the present
invention.

Figure 9 shows a table of threshold values and the table of prefixes for a (2,12)-RLL Level-4 encoder.

Figure 10 shows the runlength next state dependency for a (2,12)-RLL Level-4 encoder.

Figure 11 shows the runlength next state dependency for a (2,12)-RLL Level-8 encoder.

Figure 12 shows a table of threshold values and the table of prefixes for a (2,12)-RLL Level-8 encoder.

Figure 13 summarizes the states that can be deleted and the redirection required.

Figure 14 summarizes the percentage of random input bytes, on the average, for which DC control is possible.

Figure 15 shows a graphical presentation of the (2,12)-RLL constraint.

Figure 16 shows an adjacency matrix of G.

Figure 17 shows an adjacency matrix of H.

Figure 18 is a schematic diagram showing the location of codewords in the table that can be generated from each
state in H.

Figure 19 shows matrix D £ t

Figure 20 shows the stationary probability of encoder states.

The present invention provides a method and apparatus for encoding and decoding a sequence of input blocks,
each block comprised of p bits. The input blocks are encoded into a sequence of codewords, each codeword comprised
of q bits, the sequence of codewords satisfying a d-constraint such that consecutive bits of one type characterized by
a transition are separated by at least d bits of another type characterized by an absence of a transition, and satisfying
a k-constraint such that no more than a maximum of k bits of the other type occur between successive bits of the one
type. The method of encoding includes the steps of: receiving a p bit input block; and converting each input block into
a corresponding codeword of q serial bits using an encoder, the encoder including an encoding table, the encoding
table being representative of a number of states x_1, x_2, ..., x_n.

In the present invention, the encoder is a state machine that uses a single "overlapping" table for all states rather
than using multiple tables. Recognizing that a subset of codewords in a first state x_i are identical to a subset of
codewords in the second state x_j, the overlapping encoding table uses identical addresses for the subset of identical
codewords in the first and second state. Thus the addresses of the subset of codewords in the first state are identical to
the addresses for the subset of codewords in the second state.

The coding system of the present invention can be implemented according to a functional block diagram similar
to that shown in Figure 2 and similarly includes both a means for encoding and decoding a sequence of input blocks.
However, differences such as the configuration of the encoding table and method of addressing into the encoding table
are described in detail below. Although the encoding table is stored in memory and referred to as a "single" overlapping
table, it is not required that the "single" table be stored in sequential memory addresses.

The following detailed description provides two examples utilizing the encoding and decoding system according
to the present invention. The first example describes a four-state (2,10)-RLL encoder at rate 8:16, using a table of 546
codewords. The percentage of bytes that allow for DC control is 49.7%, on the average. Decoding is carried out by
recovering the input byte from the current 16-bit codeword.

The second example describes a (2,12)-RLL encoder at rate 8:15, using a table of 551 codewords. The percentage
of bytes that allow for DC control ranges between 7.6% and 12.2%, depending on the number of states of the encoder
which, in turn, can range between four and eight. Decoding is carried out by recovering the input byte from the current,
and possibly the next, 15-bit codeword.

I. Coding Scheme for the (2,10)-RLL Constraint

The following description is related to a coding scheme for a (2,10)-RLL constraint. Figure 3 shows a schematic
diagram of a (2,10)-RLL encoder 300. Referring to Figure 3, an input byte 310 is encoded using an
encoding table 312 to an output codeword 314. The encoder can be in any one of the states x_1, x_2, ..., x_n. In the preferred
embodiment for a (2,10)-RLL constraint, n=4. Thus at each encoding step, the encoder 300 can be in one out of four
states: S0, S1, S2-5, or S6-8. Each state is associated with a range of final runlengths which is reflected in the state
name. For example, state S6-8 is associated with the runlengths 6, 7, and 8.

In the preferred embodiment, the (2,10)-RLL encoding table 312 consists of 546 codewords, each having 16 bits.
Figure 4 shows one possible encoding table 312 for a (2,10)-RLL encoder 300 according to the present invention. The
codewords shown in Figure 4 are in hexadecimal form.

After receiving the p bit input block 310, the encoder 300 converts each input block 310 into a corresponding q bit
codeword 314 using the encoding table 312. The address 316 of the encoding table 312 is determined by prefixing a
predetermined value 318 having a fixed number of bits to the input block 310. Figure 5 shows the format of an address
to the encoding table 312. In the preferred embodiment, the predetermined prefix value 318 is a two bit value. Thus,
a ten-bit address 316 is formed by prefixing two bits to the input block 310, resulting in an address as shown by Figure 5.

The two-bit predetermined prefix value 318 depends on how the value of input block 310 (as an integer) compares
with a first threshold value and a second threshold value, T1 (320a) and T2 (320b). These thresholds 320a, 320b, in
turn, depend on the current state of the encoder 300.

Figure 6 shows a table of threshold values and the table of prefixes for a (2,10)-RLL encoder. The notation "^" stands
here for the "don't care" sign. Referring to Figure 6, for example, for a current state of S6-8 and an input byte of
038 (decimal), the corresponding codeword address will be 256 + 038 = 294.
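
The address arithmetic in this example amounts to placing the two-bit prefix above the eight input bits. The sketch below assumes that layout (prefix in the two most significant bits of a ten-bit address, consistent with 256 + 038 = 294 for prefix 01); the threshold comparison of Figure 6 is not reproduced here, so the prefix is passed in directly. The function names are hypothetical.

```python
def table_address(prefix: int, block: int, p: int = 8) -> int:
    """Form the table address by prefixing a two-bit value to a p-bit input block."""
    if not 0 <= block < (1 << p):
        raise ValueError("input block does not fit in p bits")
    if not 0 <= prefix < 4:
        raise ValueError("prefix must be a two-bit value")
    return (prefix << p) | block


def split_address(address: int, p: int = 8) -> tuple[int, int]:
    """Inverse of table_address: recover (prefix, input block)."""
    return address >> p, address & ((1 << p) - 1)


print(table_address(0b01, 38))   # the S6-8 example: 256 + 38
print(split_address(294))
```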

The output codeword 314 is the entry in encoding table 312 at the computed address. The next encoder state is
the unique state with an associated runlength range that includes the last runlength of the generated codeword. The
final runlength of the codeword determines the next state of the encoder. For the example shown in Figure 3, the output
codeword 314 is 0100001000010010 (4212 in hexadecimal notation) and the next encoder state is S1.
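
The next-state rule can be sketched directly from the state names. In the Python below, the runlength ranges are taken from the four state names given above; the function names are assumptions, and the final runlength is taken to be the number of trailing zeros of the codeword.

```python
# Runlength ranges implied by the state names S0, S1, S2-5 and S6-8.
STATE_RANGES = {"S0": (0, 0), "S1": (1, 1), "S2-5": (2, 5), "S6-8": (6, 8)}


def final_runlength(codeword: int, q: int = 16) -> int:
    """Number of zeros after the last one in a q-bit codeword."""
    bits = format(codeword, f"0{q}b")
    return len(bits) - len(bits.rstrip("0"))


def next_state(codeword: int, q: int = 16) -> str:
    """Return the unique state whose range contains the final runlength."""
    run = final_runlength(codeword, q)
    for name, (low, high) in STATE_RANGES.items():
        if low <= run <= high:
            return name
    raise ValueError(f"final runlength {run} is not covered by any state")


print(format(0x4212, "016b"), next_state(0x4212))
```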

As can be seen in Figure 6, there are cases where more than one prefix is possible, resulting in two different
codeword candidates. Assuming random input, the percentage of bytes within an input sequence for which two
codeword candidates exist is 49.7% on the average.
The encoding table 312 is designed so that those codeword candidates will have different parity of sign changes
(namely, different parity of number of 1's), thus allowing for DC control. Furthermore, both codeword candidate choices
lead the encoder to the same state and, therefore, replacement of a codeword with its alternate can be done within an
output stream without affecting preceding or following codewords. For example, if the current state is S1 and the input
is 70 (decimal), then the output codeword can be either 0000100000010001 or 0100100000010001. Both codewords
lead to state S0. The decoder can recover the input byte regardless of the specific codeword candidate that was chosen.
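
The two properties claimed for a candidate pair can be verified on the example codewords. The check below is a sketch with hypothetical function names; it uses an equal final run as a sufficient (slightly stricter) stand-in for "leads to the same state".

```python
def ones_parity(codeword: str) -> int:
    """Parity of the number of ones, i.e. of the number of NRZI sign changes."""
    return codeword.count("1") % 2


def trailing_zeros(codeword: str) -> int:
    """Length of the final run of zeros."""
    return len(codeword) - len(codeword.rstrip("0"))


def is_dc_control_pair(first: str, second: str) -> bool:
    """True if the candidates differ in parity but end in the same final run."""
    return (ones_parity(first) != ones_parity(second)
            and trailing_zeros(first) == trailing_zeros(second))


print(is_dc_control_pair("0000100000010001", "0100100000010001"))
```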

In the preferred embodiment the codeword candidates have different parity of sign change, thus allowing for DC
control. In an alternative embodiment, codeword candidates have DC components that are different and preferably of
opposite sign, thus allowing for DC control by choosing the codeword which reduces the accumulated DC.

The encoding table 312 shown in Figure 4 does not contain the (hexadecimal) words 8111, 8121, 8421, 8821,
8812, 8822, 9124, 9244, 8408, 8810, as well as none of the ten 16-bit words in the (2,10)-RLL constraint that end with
a runlength of 9 or 10. Therefore, these words can be used for synchronization. By reducing the threshold 034 in Figure
4, more codewords at the end of the encoding table 312 can be reserved for special use.

Decoding of the (2,10)-RLL system is carried out in a state-independent manner, using the encoding table 312
which is assumed to reside in an associative read-only memory: First, an address is found of a table entry containing
the received codeword. The input byte 310 is then obtained by truncating the two most significant bits (MSBs) of the
address where the codeword is located.

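In software, the associative lookup used by the (2,10)-RLL decoder can be imitated with a reverse dictionary from codeword to address. The Python sketch below is illustrative only: the function names are hypothetical, and the three table entries and their addresses are assumptions chosen to match the codewords quoted in the text (Figure 4 is not reproduced here). It assumes each codeword occurs at a single address.

```python
def build_reverse_table(table: dict[int, int]) -> dict[int, int]:
    """Map each codeword to the address where it is stored (stand-in for associative memory)."""
    return {codeword: address for address, codeword in table.items()}


def decode(codeword: int, reverse: dict[int, int], p: int = 8) -> int:
    """Recover the input byte by dropping the two most significant address bits."""
    return reverse[codeword] & ((1 << p) - 1)


# Hypothetical fragment of an encoding table: address -> 16-bit codeword.
toy_table = {
    70: 0x0811,        # 0000100000010001
    70 + 256: 0x4811,  # 0100100000010001, opposite-parity alternate
    294: 0x4212,       # 0100001000010010
}
reverse = build_reverse_table(toy_table)
print(decode(0x0811, reverse), decode(0x4811, reverse), decode(0x4212, reverse))
```

Both members of a DC-control pair decode to the same byte because their addresses differ only in the prefix bits, which is why the decoder does not need to know the encoder state or which candidate was chosen.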


II. Coding Scheme for the (2,12)-RLL Constraint

The following example is related to a coding scheme for a (2,12)-RLL constraint. The encoding algorithm has several
encoding levels, depending on the acceptable number of states and the desirable DC control. In the basic level (referred
to as Level-4), the encoder has four states (n=4) and DC control is possible in 7.6% of the input bytes, on the average.
Extended levels are obtained by adding more states, reaching eight states (n=8) in the Level-8 encoder. The extended
level (n=8) allows for DC control in 12.2% of the input bytes, on the average.

The encoder structure for the Level-4 (2,12)-RLL constraint is as shown in Figure 3. However, instead of using the
encoder table of Figure 4, the encoder table of Figure 7 is used. Figure 7 is an encoder table for a (2,12)-RLL constraint.
The encoder table in Figure 7 consists of 551 codewords, each of length 15 bits. The encoder table 312 can be divided
into two sections: the first section contains the first 292 codewords, and the second contains the remaining 259
codewords.

For n=4 the encoder can be in one out of four states: S0, S1, S2-6a, or S2-6b. The specific state depends on the
last runlength of the previous codeword and the LSB of the previous input byte. The runlength state dependency for
the (2,12)-RLL encoder according to the present invention is shown in Figure 8.

Similar to the (2,10)-RLL encoder previously described, encoding is carried out by comparing the input byte b to
threshold values and adding suitable two-bit prefixes. The threshold values and the prefixes for a (2,12) encoder are
shown in Figure 9. A threshold T1 (respectively, T2) for a state S will be denoted hereafter by T1(S) (respectively, T2(S)).

The output codeword is the entry in Figure 7 at the computed address. The next encoder state is determined by
Figure 8. DC control is attained by allowing certain input bytes to have two different codeword candidates. Assuming
random input, the percentage of such input bytes within an input sequence is 7.6% on the average.

If T2(S2-6b) is changed to 036, then the codewords at addresses 548, 549, and 550 will never be used. In addition,
the encoder table does not contain any of the ten 15-bit words in the (2,12)-RLL constraint that end with a runlength
of 9 or more. This makes all those words available for synchronization. Alternatively, T2(S2-6a) and T1(S2-6b) can be
increased to any (same) value between 036 and T2(S2-6b).

Similar to the (2,10)-RLL decoder previously described, decoding is carried out in a state-independent manner,
using the encoding table. First, an address is found of a table entry that contains the received codeword. If the codeword
ends with a runlength of 0, 1, 7, or 8, then that codeword appears only once in the table. In this case, the input byte is
obtained by truncating the two MSBs of the address where the codeword is located.

In case the codeword ends with a runlength of 2, 3, 4, 5, or 6, then it appears twice in the encoder table, in two
adjacent locations, the first of which has an even address (therefore, the two addresses differ in their LSB only). By
truncating the two MSBs of the address, all the bits of the input byte, with the exception of its LSB, are determined. In
order to recover the LSB of the input byte, the next received codeword is found in the encoder table 312. If it is located
in the first section of the encoder table shown in Figure 7 (i.e., at an address smaller than 292), then the LSB of the
input byte is 0. Otherwise, it is 1.
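
The decoding rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table contents are those of Figure 7 and are not reproduced here, so the caller supplies the table as a list of bit strings, and the names `SECTION_BOUNDARY`, `find_address`, and `decode_byte` are hypothetical.

```python
SECTION_BOUNDARY = 292  # addresses below this form the first section of the table


def trailing_zeros(codeword: str) -> int:
    """Last runlength of a codeword written as a bit string ('...100' -> 2)."""
    return len(codeword) - len(codeword.rstrip("0"))


def find_address(table, codeword: str) -> int:
    """Lowest address whose entry equals the codeword (assumed lookup helper)."""
    return table.index(codeword)


def decode_byte(table, current: str, following: str) -> int:
    """Recover the input byte for `current`, peeking at `following` if needed."""
    address = find_address(table, current)
    byte = address & 0xFF  # truncate the two MSBs of the 10-bit address
    if 2 <= trailing_zeros(current) <= 6:
        # The codeword is stored twice, at an even/odd address pair, so the
        # seven upper bits are already known. The LSB is read off from the
        # section of the table in which the next codeword is located.
        lsb = 0 if find_address(table, following) < SECTION_BOUNDARY else 1
        byte = (byte & 0xFE) | lsb
    return byte
```

Only the final input byte of a sequence has no following codeword; how that case is closed off is not covered by this sketch.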

Alternatively, when the received codeword ends with a runlength of 2, 3, 4, 5, or 6, the criterion for determining
the LSB of the input byte can be modified provided that the following changes are made: (a) increase T2(S2-6a) and
T1(S2-6b) to 038, and (b) switch between the table contents at the following address pairs: 036 <-> 041, 037 <-> 048,
292 <-> 297, and 293 <-> 304. With those changes, the LSB of the current input byte can be determined by the first
and second runlengths of the next codeword according to the table shown in Figure 10, whenever the current codeword
ends with a runlength of 2 through 6.

Alternatively, the levels of encoding may be extended. The following example describes a (2,12)-RLL coding having
eight states. The advantage of using extended levels of encoding is having more input bytes, on average, where
DC control is possible, at the expense of increasing the number of states of the encoder. Assuming random input, the
percentage of input bytes which have two codeword candidates reaches 12.2% on average, compared to 7.6% in
Level-4.

Level-8 is an extended encoding level in which the encoder has eight states. Each state is determined by the last
runlength of the previous codeword and the LSB of the previous input byte, as can be seen in the table shown in Figure
11. Encoding is carried out by comparing the input byte b to threshold values and adding suitable prefixes, according
to the table shown in Figure 12.
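
As a rough illustration of how the next Level-8 state follows from the last runlength and the LSB of the input byte, consider the sketch below. Figure 11 is not reproduced in this text, so the mapping is an assumption pieced together from the state names used in Section III (S0, S1, S2a, S3a, S4a, S5-6a, S2-6b, S7-8) and from the convention that the first copy of a duplicated codeword (even address, LSB 0) enters an "a" state. The function name `next_state_level8` is hypothetical.

```python
def next_state_level8(codeword: str, input_lsb: int) -> str:
    """Assumed Level-8 next-state rule; codeword is a bit string."""
    run = len(codeword) - len(codeword.rstrip("0"))  # last runlength
    if run in (0, 1):
        return f"S{run}"
    if run in (7, 8):
        return "S7-8"
    if 2 <= run <= 6:
        if input_lsb == 1:
            return "S2-6b"  # second copy of a duplicated codeword
        return "S5-6a" if run >= 5 else f"S{run}a"
    raise ValueError("codewords ending with a runlength of 9 or more are not in the table")
```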

The last three codewords in the encoder table can be used for synchronization if T2(S2-6a) and T1(S7-8) are
changed to 036.

There are 215 input bytes at state S7-8 for which two output codewords are possible. In each such pair, both
codewords lead to the same encoder state, and all but 16 pairs allow for DC control (namely, there are 16 pairs where
the two codewords have the same parity of number of 1's). In all other states, all replacement codewords allow for DC
control.

Some states in Level-8 can be deleted, thus generating lower encoding levels. When a state is deleted, the codewords
that led the encoder to that state need to be redirected into another state. The states that can be deleted and
the redirection required are summarized in the table shown in Figure 13. In particular, the Level-4 (2,12)-RLL encoder
previously described is obtained by deleting states S2a, S3a, S4a, and S7-8, in which case state S5-6a becomes state
S2-6a. Figure 14 summarizes the percentage of random input bytes, on average, for which DC control is possible,
for certain configurations of deletion of states.

The decoder described for the (2,12)-RLL can be used as is for extended encoding levels as well. The criteria for
determining the LSB of the input byte can be similarly modified, except that now all thresholds T2 equaling 036 in
Figure 12, as well as T1(S2-6b), need to be changed to 038.

III. Code Design

In this section, we outline the principles that guided the design of the coding and encoding systems previously
discussed. Before the step of converting each input block into a corresponding codeword, the encoder table must be
defined. Although various design methodologies may be used, in the preferred methodology the codewords are defined
by the following steps: determining the adjacency matrix A_G; computing an approximate eigenvector of the adjacency
matrix A_G; deleting states with zero weight; merging at least a subset of the states having the same weight; responsive
to the deleted states and merged states, determining a new adjacency matrix A_H; and reducing the number of encoder
states.

The following is a description of the design methodology for a (2,12)-RLL constraint. The first step in the design
methodology is the step of determining the adjacency matrix A_G. Let G denote the graph presentation of the (2,12)-RLL
constraint which is shown in Figure 15. The adjacency matrix of G, shown in Figure 16 and denoted A_G, is a
13 x 13 matrix whose rows and columns are indexed by the states (vertices) of G, and the (u,v)th entry of A_G, denoted
(A_G)_{u,v}, equals the number of edges from state u to state v in G. That is, for 0 ≤ u, v ≤ 12,

\[
(A_G)_{u,v} =
\begin{cases}
1 & \text{if } v = u + 1, \text{ or if } 2 \le u \le 12 \text{ and } v = 0, \\
0 & \text{otherwise.}
\end{cases}
\]



The graph G^q is obtained from G in the following manner. The set of states of G^q is the same as that of G, and
each edge in G^q corresponds to a path of length q in G, beginning at the initial state of the first edge of the path in G
and ending at the terminal state of the last edge of the path in G. The label of an edge in G^q is the word generated by
the respective path in G. The adjacency matrix of G^q equals (A_G)^q. For q = 15 we get the matrix in Figure 16.
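
The construction can be checked numerically. The sketch below builds the adjacency matrix of the usual graph presentation of a (d,k)-RLL constraint, in which state u records the number of zeros since the last 1, and raises it to the 15th power. It assumes that Figure 15 shows this standard presentation; the helper names `rll_adjacency`, `mat_mul`, and `mat_pow` are illustrative only.

```python
D, K, Q = 2, 12, 15


def rll_adjacency(d: int, k: int):
    """Adjacency matrix of the (d,k)-RLL graph with states 0..k."""
    n = k + 1
    a = [[0] * n for _ in range(n)]
    for u in range(n):
        if u < k:
            a[u][u + 1] = 1  # emit a 0: the current run of zeros grows
        if u >= d:
            a[u][0] = 1      # emit a 1: the run ends, back to state 0
    return a


def mat_mul(x, y):
    """Plain integer matrix product."""
    return [[sum(x[i][t] * y[t][j] for t in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def mat_pow(a, q: int):
    """a raised to the q-th power by repeated multiplication (q >= 1)."""
    result = a
    for _ in range(q - 1):
        result = mat_mul(result, a)
    return result


A_G = rll_adjacency(D, K)
A_G_15 = mat_pow(A_G, Q)
# Entry (u, v) counts the 15-bit words generated by paths from state u to state v.
print(A_G_15[0])
```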

A. State Merging and State Splitting 

To obtain a rate 8:15 finite-state encoder, we first invoke the technique of state splitting, which is described in the
references "Algorithms for sliding block codes - an application of symbolic dynamics to information theory," Adler et
al., IEEE Trans. Inform. Theory, 29 (1983), 5-22; "Constrained systems and coding for recording channels," Marcus et
al., to appear in Handbook of Coding Theory, R.A. Brualdi, W.C. Huffman, V. Pless (Eds.), Elsevier, Amsterdam, and
also appeared as TR 839, Computer Science Department, Technion, Haifa, Israel, December 1994; and "Finite-state
modulation codes for data storage," IEEE J. Sel. Areas Comm., 10 (1992), 5-37. We start by computing an (A_G^15, 2^8)-
approximate eigenvector, which is a nonnegative nonzero integer vector x = [x_u], u = 0, ..., 12, such that
A_G^15 x ≥ 2^8 x, where the inequality holds component by component. The entry x_u is referred to as the weight of
state u. An (A_G^15, 2^8)-approximate eigenvector with components adding up to the smallest sum possible is given by

    x = [1 1 2 2 2 2 2 1 1 0 0 0 0]^T.

In particular, any (A_G^15, 2^8)-approximate eigenvector must contain a component which is greater than 1. It follows
that there is no finite-state encoder for the (2,12)-RLL constraint at rate 8:15 that has a sliding-block decoder with m
= a = 0. The encoder we construct has a sliding-block decoder with m = 0 and a = 1. Furthermore, seven bits of the
current input byte can be determined from the current received codeword alone. The LSB of the current input byte is
the only bit which may require the next received codeword for decoding. (We remark that in the case of the (2,10)-RLL
constraint, there is an approximate eigenvector which is a 0-1 vector. Indeed, in this case, we have managed to obtain
a sliding-block decoder with m = a = 0. Such encoders are called block decodable.)
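
The approximate-eigenvector condition is easy to test by machine. The following sketch assumes the standard graph presentation of the (2,12)-RLL constraint (state u counts the zeros since the last 1) and checks the inequality A_G^15 x ≥ 2^8 x component by component. The function names are hypothetical.

```python
def rll_adjacency(d: int, k: int):
    """Adjacency matrix of the (d,k)-RLL graph with states 0..k."""
    n = k + 1
    a = [[0] * n for _ in range(n)]
    for u in range(n):
        if u < k:
            a[u][u + 1] = 1  # a 0 extends the current run
        if u >= d:
            a[u][0] = 1      # a 1 closes the run
    return a


def power_times_vector(a, x, q: int):
    """Compute A^q x through q successive matrix-vector products."""
    for _ in range(q):
        x = [sum(r * c for r, c in zip(row, x)) for row in a]
    return x


def is_approx_eigenvector(a, x, q: int, n: int) -> bool:
    """True if x is nonnegative, nonzero, and A^q x >= n x componentwise."""
    lhs = power_times_vector(a, x, q)
    return (any(x) and all(v >= 0 for v in x)
            and all(l >= n * v for l, v in zip(lhs, x)))


A_G = rll_adjacency(2, 12)
X = [1, 1, 2, 2, 2, 2, 2, 1, 1, 0, 0, 0, 0]
weighted = power_times_vector(A_G, X, 15)
print(weighted[:2], is_approx_eigenvector(A_G, X, 15, 2 ** 8))
```

Component u of `weighted` is the number of 15-bit words that can be generated from state u, each counted with the weight of its terminal state; the all-ones vector fails the same test, in line with the remark that some component must exceed 1.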

After the step of computing an approximate eigenvector, states having zero weight are deleted and states having
the same weight are merged. States 9 through 12 have zero weight and therefore can be removed from G^15 with all
their incoming and outgoing edges. States 7 and 8 have the same weight and, in addition, all words that can be
generated by paths beginning at state 8 in G^15 can also be generated starting at state 7. Therefore, state 7 can be merged
into state 8 by redirecting edges incoming to state 7 so that they terminate in state 8, thereby allowing the deletion of
state 7. Similarly, states 2 through 5 can be merged into state 6. Indeed, when doing so, we will end up with the Level-5
encoder which is defined in Figure 14. To obtain the Level-8 encoder, we merge only state 5 into state 6. (The same
applies with respect to the (2,10) encoder previously described: we could merge states 2 through 7 into state 8 to
obtain a (2,10)-RLL encoder with three states; however, to gain more DC control, we merged states 2 through 4 into
state 5 and states 6 and 7 into state 8, resulting in four states altogether.)

After merging and deleting states, we obtain a graph H with the following seven states: S0, S1, S2, S3, S4, S5-6,
and S7-8. States S0 through S4 correspond to states 0 through 4 in G^15, state S5-6 is obtained by merging state 5
into state 6 in G^15, and state S7-8 is obtained by merging state 7 into state 8 in G^15. Note that the label (word) of each
edge in H uniquely determines the terminal state of that edge; in fact, the last runlength in that word identifies that
terminal state. Such graphs H are said to have memory 1.

The adjacency matrix of H is given by A_H and is shown in Figure 17 with an (A_H, 2^8)-approximate eigenvector
y = [1 1 2 2 2 2 1]^T.

States with weight greater than 1 need to be split. When a state u is split, two or more descendent states are
formed. The incoming edges to u are duplicated into each of the descendent states, whereas the outgoing edges from
u are partitioned among the descendent states. In addition, the weight of u is shared among the descendent states so
that after the splitting, the following holds: the weights of the descendent states are positive integers that sum up to
the weight of their parent state, and the weights of the terminal states of the edges outgoing from each descendent
state v sum up to at least 2^8 times the weight of v. An encoder is obtained after several rounds of splitting, when all
weights become 1. In the graph H, there are four states that need to be split, namely, states S2, S3, S4, and S5-6. It
can be verified that each of these states can be split into two states, resulting in descendent states each having at
least 2^8 outgoing edges. In fact, after splitting, most states will have more than 2^8 outgoing edges, which will allow
having alternate codewords and hence DC control. We point out that the encoder in the reference "EFMPlus: The
Coding Format of the MultiMedia Compact Disc", Proc. 16th Symp. on Inform. Theory in the Benelux, Nieuwerkerk a/d
IJssel, May 18-19, 1995, for the (2,10)-RLL constraint can be obtained in a similar manner by one state splitting. On the
other hand, we did not have to apply state splitting in order to obtain the previously described (2,10)-RLL encoder.

Straightforward splitting of the four states with weight 2 in H may result in 11 encoder states. To reduce the number
of encoder states and to obtain a compact codeword table, we turn to the next design step of defining a certain order
on the outgoing edges (or rather, their labels) from each state in H. The specific order we choose will allow for simple
decoding rules.

To this end, we write down all words of length 15 that satisfy the (2,12)-RLL constraint, according to descending
order of their first runlength. We omit words that end with a runlength of 9 or more, since the respective edges were
deleted when H was formed. On the other hand, each word ending with a runlength of 2, 3, 4, 5, or 6 will be written
twice in two consecutive places. Indeed, those words correspond to edges that were duplicated due to the splitting of
states S2, S3, S4, and S5-6. The resulting list consists of 551 words and will serve as the codeword table of the encoder.
If we count edges with their multiplicity according to the weight of their terminal state, we find that there are
(A_H y)_{S0} = 2^8 outgoing edges from state S0 in H and their labels start with runlengths between 2 and 12. Those labels
appear as the first 2^8 codewords in the table. Similarly, the first runlength of the labels of the edges outgoing from state
S1 ranges between 1 and 11. Those labels appear in the table at (A_H y)_{S1} = 374 consecutive entries, starting at
address 002 (the third entry).
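
The word list can be regenerated by brute force, which gives a quick check of the counts quoted above. The sketch below enumerates all 15-bit strings and keeps those that satisfy the constraint as described; the order inside each first-runlength group is arbitrary here (the actual table is reordered later for DC control), and the helper names are hypothetical.

```python
from itertools import product

Q, D, K = 15, 2, 12


def runlengths(word: str):
    """Zero-run lengths: before the first 1, between 1s, after the last 1."""
    return [len(run) for run in word.split("1")]


def is_rll_word(word: str) -> bool:
    """True if the word can be generated by some path in the (2,12)-RLL graph."""
    if "1" not in word:
        return False  # fifteen zeros would exceed k = 12
    runs = runlengths(word)
    first, inner, last = runs[0], runs[1:-1], runs[-1]
    return first <= K and last <= K and all(D <= r <= K for r in inner)


def build_word_list():
    words = ["".join(bits) for bits in product("01", repeat=Q)]
    # Omit words that end with a runlength of 9 or more.
    words = [w for w in words if is_rll_word(w) and runlengths(w)[-1] <= 8]
    # Descending order of the first runlength (the sort is stable).
    words.sort(key=lambda w: -runlengths(w)[0])
    table = []
    for w in words:
        copies = 2 if 2 <= runlengths(w)[-1] <= 6 else 1
        table.extend([w] * copies)  # duplicated words sit in consecutive places
    return table


table = build_word_list()
print(len(table))
```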

Figure 18 is a schematic diagram showing the location of codewords in the table that can be generated from each
state in H. The encoder table 410 is a table of 551 codewords, divided into runlength intervals according to their first
runlength. Some runlengths have been combined in the figure; e.g., all codewords that start with a runlength between
2 and 4 are marked as one runlength interval.

The two-sided vertical arrows 412 mark the locations of codewords that can be generated from each state in H.
The number of codewords (counting multiplicity) that correspond to each state is written in parentheses under the
state name. For every state u, this number is equal to (A_H y)_u. Note that the runlengths were clustered into runlength
intervals in the table in such a way that the interval boundaries separate between segments of the table that correspond
to different states in H.

Figure 18 also shows a splitting of states S2, S3, S4, and S5-6 which is marked by the dashed line. In each one
of those states, the outgoing edges are partitioned so that edges labeled by codewords that are located at addresses
< 292 belong to a descendent state that inherits the name of the parent state with a suffix "a". The rest of the edges
belong to the other descendent state that inherits the name of the parent state with the suffix "b". The number 292 was
chosen so that state S5-6a will have at least 2^8 outgoing edges. In order to have a valid splitting, we make sure that
the entries at addresses 291 and 292 do not contain two copies of the same codeword. Each incoming edge to a state
that was split is duplicated to enter both descendents of that state, and both new edges inherit the same label of the
parent edge. A duplicated codeword in the table corresponds to such a duplicated edge. We will follow a convention
whereby the first copy in the table of such a codeword corresponds to an edge entering state "a", whereas the second
copy enters state "b".

The following observations can be made: 

* All codewords that can be generated from a given state form a contiguous segment of the table. Using an
observation made in the reference "Sequence-state methods for run-length-limited coding," Franaszek et al., IBM J.
Res. Develop., 14 (1970), 376-383, a contiguous segment can be generated for any (d,k)-RLL constraint.
Segments of the table that correspond to different states can overlap, thus resulting in a compact table that serves all
states.

* States S2b, S3b, S4b, and S5-6b are equivalent in the sense that the sets of codeword sequences that can be
generated from each one of those states are the same. Therefore, we can combine those states into one state
which we call S2-6b.

* All states, with the exception of S0 and S5-6a, have more than 2^8 outgoing edges. States S1 and S7-8 have
significantly more edges.

The current table already allows for a coding scheme as follows: Encoding is carried out by adding a two-bit prefix
to the input byte as shown in Figure 5. The two-bit prefix is chosen so that the resulting address falls within the address
range of the (contiguous) segment of the table that corresponds to the current state, as determined by Figure 18. Since
the table segment of each state consists of at least 2^8 codewords, such a prefix can always be found. In fact, in many
cases more than one prefix is possible. Finding the right prefix can be translated into threshold comparison.

We demonstrate this on state S2a. The table segment that corresponds to this state occupies addresses a in the
range 005 ≤ a < 292. If the input byte is a number b in the range 005 ≤ b < 256, then b can serve as the address to
the table from which the codeword is to be taken. Otherwise, if b < 005, then we obtain the address by adding 2^8 (or
prefixing '01') to b. Notice that prefixing '01' yields a valid address also when b < 036. Therefore, we set the thresholds
T1 and T2 to 005 and 036, respectively, and encoding will proceed as follows:

if b < T1, then the prefix is '01'; otherwise,

if b < T2, then the prefix can be either '01' or '00'; otherwise, the prefix is '00'. These thresholds and prefixes coincide
with those in Figure 12.
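
The threshold rule for state S2a translates directly into code. The sketch below returns the table addresses permitted for an input byte b; the function name `candidate_addresses` is hypothetical, and only the two prefixes '00' and '01' used in this example are modelled.

```python
def candidate_addresses(b: int, t1: int, t2: int):
    """Addresses allowed for input byte b; '00' keeps b, '01' adds 2**8."""
    if not 0 <= b < 256:
        raise ValueError("input byte out of range")
    if b < t1:
        return [b + 256]      # prefix '01' only
    if b < t2:
        return [b, b + 256]   # prefix '00' or '01': two candidate codewords
    return [b]                # prefix '00' only


# State S2a uses T1 = 005 and T2 = 036, and its segment is 005 <= a < 292.
for b in (3, 5, 35, 36):
    print(b, candidate_addresses(b, 5, 36))
```

Input bytes that yield two addresses are exactly those for which the encoder has a choice of codeword, which is what later enables DC control.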

Decoding is carried out as follows: If the received codeword appears once in the table, then the input byte is
uniquely determined by the address of that codeword. Note that this holds regardless of the state from which this
codeword was generated during encoding. If the codeword appears twice, then we need to verify whether the next
encoder state was an "a" descendent of a state or rather state S2-6b. Codewords that can be generated from "a" states
appear in the table at addresses smaller than 292, whereas codewords that can be generated from state S2-6b appear
at addresses ≥ 292. Therefore, by determining which section of the table the next codeword is located in, the decoder
can fully recover the input byte.

Recall that codewords that appear twice in the table occupy consecutive addresses. If, in addition, we manage to
put the first codeword in each such pair at an even address, then the most significant seven bits of the current input
byte will be determined by the current received codeword. Such an assignment of addresses can be easily obtained
in our case.

B. DC Control 

So far, we have shown how a table can be constructed so that encoding and decoding are efficient and error
propagation is limited. We will now reorder the codewords in our table so that the structure of Figure 18 is maintained,
while satisfying additional conditions to allow for DC control.

More specifically, let a and a + 2^8 be two addresses in the table, both belonging to the same table segment
corresponding to some state in H. Then, we require that, to the largest extent possible, the following two conditions will
hold for every such pair of addresses:

(C1) The terminal states of the codewords at addresses a and a + 2^8 should be the same. If this condition is
satisfied, then we can interchangeably encode any one of those two codewords, without affecting subsequent
encoded codewords.

(C2) The codewords at addresses a and a + 2^8 should have different parities (of number of 1's). The option to
choose between these two codewords during encoding will yield the desired effect of DC control.

In our case, for all 002 ≤ a < 551 - 2^8 = 295, there is a state in H such that both a and a + 2^8 belong to the segment
corresponding to that state in the table. For the sake of simplicity, we will apply conditions (C1) and (C2) also to the
remaining addresses 000 and 001. This way, we obtain a "semi-periodic" table where the codewords at every pair of
addresses at distance 2^8 apart have the same terminal state and different parities (to the largest extent possible).
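
To see why a pair of codewords with different parities gives a handle on the DC content, consider the following sketch. It rests on assumptions that are not stated in this section: the codewords are taken to be recorded with NRZI modulation (each 1 toggles the signal level), and the encoder is taken to pick greedily the candidate that brings the running digital sum (RDS) closest to zero. The function names are hypothetical.

```python
def nrzi_rds(codeword: str, level: int, rds: int):
    """Pass a codeword through an assumed NRZI modulator.

    level is +1 or -1 (polarity before the codeword) and rds is the running
    digital sum so far. Returns (new_level, new_rds).
    """
    for bit in codeword:
        if bit == "1":
            level = -level  # a 1 marks a transition
        rds += level
    return level, rds


def parity(codeword: str) -> int:
    """Parity of the number of 1's; it decides whether the polarity flips."""
    return codeword.count("1") % 2


def pick_codeword(candidates, level: int, rds: int) -> str:
    """Greedy choice (an assumption): minimise |RDS| after this codeword."""
    return min(candidates, key=lambda w: abs(nrzi_rds(w, level, rds)[1]))


a, b = "000100000000100", "000100100000100"
print(parity(a), parity(b), pick_codeword([a, b], level=1, rds=4))
```

Because the two candidates differ in parity, they leave the modulator in opposite polarities, so the choice also steers the sign of the contribution of all following codewords.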
We reorder the table using the following procedure.



    n ← 2^8; M ← 551;
    s ← 0;
    while (s + n < M) {
        l ← min { t | t > s and t - i·n is a runlength interval boundary
                  for some i = 0, 1, ..., ⌊s/n⌋ };
        h ← min { t | t > s and t + n is a runlength interval boundary };
        if (l ≤ h) {
            Reorder [s+n, h+n) so that [s+n, l+n) matches [s, l);
            s ← l;
        }
        else {
            Reorder [s - i·n, l - i·n) simultaneously for i = 0, 1, ..., ⌊s/n⌋
                so that [s - i·n, h - i·n) match [s+n, h+n);
            s ← h;
        }
    }



Table portions that start at address x and end at address y-1 (inclusive) are denoted in the procedure by [x,y). The
basic idea of the procedure is scanning the table, starting at address 000, and identifying table portions [s,l) such that
for every i = 0, 1, ..., ⌊s/2^8⌋, each of the portions [s - 2^8 i, l - 2^8 i) is entirely contained in some runlength interval
in the table. We then look at a new portion [s + 2^8, h + 2^8), which is also entirely contained in some runlength interval.
If l ≤ h, then we reorder the portion [s + 2^8, h + 2^8) so that its (l-s)-prefix matches the portion [s,l), codeword by
codeword, in terms of conditions (C1) and (C2). This might not always be possible, but in many cases it is, partly because
in many cases h is significantly larger than l. If l > h, then we reorder the portions [s - 2^8 i, l - 2^8 i) simultaneously
for i = 0, 1, ..., ⌊s/2^8⌋ to match [s + 2^8, h + 2^8).

Applying this procedure to our table, we obtain a reordering of the table where condition (C1) is fully met, whereas
condition (C2) is satisfied for all but 16 pairs of addresses. The procedure would have found a full matching had there
existed one.

This procedure can be generalized to obtain a semi-periodic table for any table-based rate p:q encoder for a
(d,k)-RLL constraint with q > k that has been constructed along the lines of Section III (namely, the encoder is obtained
by state splitting of a graph presentation H of a subset of the (d,k)-RLL constraint, and then constructing an encoder
table in which codewords are ordered according to their first runlength and duplicated according to the weights of their
terminal states in H). The procedure will find a full matching, if there exists one, whenever address 2^p in the table is an
interval boundary of one of the runlength ranges. The procedure can be easily adapted to handle also the case where
there are two interval boundaries in the table at a distance which is close to 2^p.

The state diagram of the resulting encoder is a graph E with eight states, and the adjacency matrix of E is an 8 x
8 matrix A_E where all the rows are equal to [59 40 28 19 13 15 75 7].

Rows and columns are indexed by the encoder states, according to the following order: S0, S1, S2a, S3a, S4a,
S5-6a, S2-6b, and S7-8. The entries in A_E do not take into account alternate codewords that are used for DC control.
These, in turn, are counted separately in the matrix D_E, which is shown in Figure 19. Referring to the matrix
D_E, 16 additional codewords that could have been generated from state S7-8 have been omitted since they do not
satisfy condition (C2).

The peculiarity of having equal rows in A_E for all encoder states is a consequence of the fact that the terminal
states are the same for codewords that are located in the table at addresses which are at distance 2^8 apart. Therefore,
the distribution of the terminal states of the first 2^8 outgoing edges from a given encoder state is the same for all encoder
states.

Figure 20 shows the stationary probabilities of the encoder states. The stationary probability of being at a given
encoder state u is obtained by dividing (A_E)_{v,u} (for an arbitrary v) by 2^8. These probabilities are summarized in the
second column of the table shown in Figure 20. The entries in the third column of the table shown in Figure 20 are
obtained by dividing each row-sum in D_E by 2^8. These numbers measure for each encoder state the fraction of input
bytes that allow for DC control from that state. The expectation of those numbers, with respect to the stationary
probabilities, yields the average fraction of input bytes in a random sequence that allow for DC control. This average is
equal to 0.1219.
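
The stationary probabilities follow directly from the common row of A_E, as the short sketch below shows. The row-sums of D_E appear only in Figure 19, which is not reproduced in this text, so the per-state DC-control fractions are left as an input to the hypothetical function `expected_dc_fraction`.

```python
from fractions import Fraction

STATES = ["S0", "S1", "S2a", "S3a", "S4a", "S5-6a", "S2-6b", "S7-8"]
ROW = [59, 40, 28, 19, 13, 15, 75, 7]  # common row of A_E
N = 2 ** 8


def stationary_from_equal_rows(row, n):
    """With identical rows, the next state does not depend on the current one,
    so the stationary probability of state u is row[u] / n."""
    return [Fraction(c, n) for c in row]


def expected_dc_fraction(pi, dc_fraction):
    """Average of per-state DC-control fractions under the distribution pi."""
    return sum(p * f for p, f in zip(pi, dc_fraction))


pi = stationary_from_equal_rows(ROW, N)
for name, p in zip(STATES, pi):
    print(f"{name:6s} {float(p):.4f}")
```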

It is worthwhile pointing out that the Shannon capacity of the (2,12)-RLL constraint is 0.547 and, therefore, the
encoder operates at a rate, 8:15, which is just 2.5% below capacity. Furthermore, the effective Shannon capacity of a
(2,12)-RLL constraint having a DC control in 12.19% of the input bytes is 0.536, which means that the Level-8 encoder
operates at a rate which is only 0.5% below the effective capacity.
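
The capacity figure can be reproduced with a few lines of arithmetic. The sketch below uses the standard characterisation of the capacity of a (d,k)-RLL constraint as log2 of the largest real root z of z^-(d+1) + ... + z^-(k+1) = 1, and finds the root by bisection. The effective capacity under DC control is not computed here, and the function name `rll_capacity` is hypothetical.

```python
import math


def rll_capacity(d: int, k: int) -> float:
    """Shannon capacity (bits per channel bit) of the (d,k)-RLL constraint."""
    def g(z: float) -> float:
        return sum(z ** (-j) for j in range(d + 1, k + 2)) - 1.0

    lo, hi = 1.0, 2.0  # g(1) = k - d > 0 and g(2) < 0, so the root lies between
    for _ in range(200):
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.log2(lo)


capacity = rll_capacity(2, 12)
rate = 8 / 15
print(f"capacity = {capacity:.4f}, rate = {rate:.4f}, gap = {1 - rate / capacity:.2%}")
```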

The discussion in this section has concentrated on the design of the Level-8 encoder. Reduction in the number of
states (at the expense of a more limited DC control) can be obtained by merging and deleting states according to the
table shown in Figure 13. The incoming edges to a deleted state u are redirected to one of the states v with the property
that every codeword sequence that can be generated from v can also be generated from u (and, in fact, it can be
generated from u for the same sequence of input bytes). Similar design tools can be applied to obtain the previously
described (2,10)-RLL encoder of Section II, as well as encoders for certain other (d,k)-RLL constraints.

It is understood that the above description is intended to be illustrative and not restrictive. For example, the coding 
system may be based on other constraints than the (d,k)-RLL constraint. Further, the arrangement of codewords or 
table size may vary. For example, the table size may vary dependent on the number of states split, merged, and deleted. 
Further, depending on the design, there may be more than one alternative codeword candidate. The scope of the
invention should therefore not be determined with reference to the above description, but instead should be determined
with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.



Claims 

1. A method of encoding a sequence of input blocks (310) each block comprised of p bits, into a sequence of
codewords (314) each codeword comprised of q bits, the sequence of codewords satisfying a d-constraint such that
consecutive bits of one type characterized by a transition are separated by at least d bits of another type
characterized by an absence of a transition, and satisfying a k-constraint such that no more than a maximum of k bits of
the other type occur between successive bits of the one type, comprising the steps of:

receiving a p bit input block; and

converting each input block into a corresponding codeword of q serial bits using an encoder (300), the encoder
including an encoding table (312), the encoding table being representative of a number of states x_1, x_2, ..., x_n,
wherein at least a subset of the codewords in a first state are identical to a subset of codewords in a second
state, and further wherein the addresses for the subset of codewords in the first state are identical to the
addresses for the subset of codewords in the second state.

2. The method recited in claim 1 wherein the address of the codeword in the encoding table corresponding to the
input block is determined by appending a prefix having a fixed number of bits to the input block.

3. The method recited in claim 2 wherein the prefix may have more than one possible value resulting in more than
one possible codeword value.

4. The method recited in claim 3 wherein the value of the prefix appended to the input block is determined based on
a comparison of the value of the input block to a first threshold value and to a second threshold value, wherein
the value of the first threshold value and the value of the second threshold value depend on the current state of
the encoder.

5. The method recited in claim 1 wherein the final runlength of the codeword determines the next stage of the encoder.

6. The method recited in claim 1 wherein each of the codewords in the coding table has a unique value.

7. The method recited in claim 1 further including the step of defining the codewords of the encoding table, wherein
the codewords are defined prior to the step of converting each input block into a corresponding codeword.

8. The method recited in claim 7 wherein defining the codewords of the encoding table includes the steps of:

determining the adjacency matrix A_G;

computing an approximate eigenvector of the adjacency matrix A_G;

deleting states with zero weight;

merging at least a subset of the states having the same weight;

responsive to the deleted states and merged states, determining a new adjacency matrix A_H; and

reducing the number of encoder states.

9. A method of encoding and decoding a sequence of input blocks (310), each block comprised of p bits, into a
sequence of codewords (314) each codeword comprised of q bits, the sequence of codewords satisfying a
d-constraint such that consecutive bits of one type characterized by a transition are separated by at least d bits of
another type characterized by an absence of a transition, and satisfying a k-constraint such that no more than a
maximum of k bits of the other type occur between successive bits of the one type, comprising the steps of:

receiving a p bit input block;

converting each input block into a corresponding codeword of q serial bits using an encoder (300), the encoder
including an encoding table (312), the encoding table being representative of a number of states x_1, x_2, ..., x_n,
wherein at least a subset of the codewords in a first state are identical to a subset of codewords in a second
state, and further wherein the addresses for the subset of codewords in the first state are identical to the
addresses for the subset of codewords in the second state; and

decoding each codeword.

10. A coding system for encoding and decoding a sequence of input blocks (310) each block comprised of p bits, into
a sequence of codewords (314) each codeword comprised of q bits, the sequence of codewords satisfying a
d-constraint such that consecutive bits of one type characterized by a transition are separated by at least d bits of
another type characterized by an absence of a transition, and satisfying a k-constraint such that no more than a
maximum of k bits of the other type occur between successive bits of the one type, the system comprised of:

a converting means for converting each input block into a corresponding codeword of q serial bits using an
encoder (300), the encoder including an encoder table (312), the encoding table being representative of a
number of states x_1, x_2, ..., x_n, wherein at least a subset of the codewords in a first state are identical to a subset
of codewords in a second state, and further wherein the addresses for the subset of codewords in the first
state are identical to the addresses for the subset of codewords in the second state; and

a decoding means for decoding each codeword.

Q-ZA 
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Input byte 




3)2 

■A 



0000100000010001 



0100100000010001 



Encoder table 
(546 codewords) 



314 

Output 
codeword 



FIG. 3 



; 



9 8 7 6 5 4 3 2 1 0 



1 Prefix 



Input byte b 
ADDRESS    CONTENTS OF TABLE (HEXADECIMAL)
(DECIMAL)     0     1     2     3     4     5     6     7     8     9
        0  0021  ????  00?4  0020  0041  0081  0101  0201  0249  0049
       10  0089  0091  0109  0111  0121  0042  0082  0102  0202  0092
       20  0112  0122  0044  0048  0084  0088  0090  0124  0224  0244
       30  0040  0080  0100  0209  0211  0221  0241  0212  0222  ????
       40  ????  010?  0110  0120  0204  0208  0210  0220  0240  0242
       50  0401  0449  0489  0491  0801  0849  0889  0891  0909  0911
       60  0921  1049  1089  1091  0409  0411  0421  0441  0481  0809
       70  0811  0821  0841  0881  0901  0402  0492  0802  0892  0912
       80  0922  1002  0412  0422  0442  0482  0812  0822  0842  0882
       90  0902  0404  ????  0410  0420  0804  0808  0810  0820  0924
      100  0424  0444  0448  0484  0488  0490  0824  0844  0848  0884
      110  0888  0890  0904  0908  0910  0920  1024  1044  0440  0480
      120  0840  0880  0900  1109  1111  1121  1209  1211  1221  1241
      130  100?  1011  1021  1041  1081  1101  1201  1249  1092  1112
      140  1122  1212  1222  1242  1012  1022  1042  1082  1102  1202
      150  1004  1008  1010  1020  1124  1224  1244  1248  1048  1084
      160  1088  ????  1104  1108  1110  1120  ????  1208  1210  1220
      170  1040  1080  1100  1240  2049  2089  2091  2109  2111  2121
      180  2209  2211  2221  2241  2409  2411  2421  2441  2481  2009
      190  2011  2021  2041  2081  2101  2201  2249  2401  2449  2489
      200  2092  2112  2122  2212  2222  2242  ????  2422  2442  2482
      210  2012  2022  2042  2082  2102  2202  2402  2492  2004  2008
      220  2010  2020  2124  2224  2244  2248  2424  2444  2448  2484
      230  2488  2490  2024  2044  2048  2084  2088  2090  2104  2108
      240  2110  2120  2204  2208  2210  2220  2404  2408  2410  2420
      250  2040  2080  2100  2240  2440  2480  2491  40??  4024  4020
      260  4041  4081  4101  4201  4249  4049  4089  4091  4109  4111
      270  4121  4042  4082  4102  4202  40??  4112  4122  4044  4048
      280  4084  4088  4090  4124  4224  4244  4248  4040  4080  4100
      290  4200  4211  4221  4241  4212  4222  4104  4108  4110  4120
      300  4204  4208  4210  4220  4240  4242  4401  4449  4489  4491
      310  4801  4849  4889  4891  4909  4911  4921  4009  4021  4011
      320  4409  4411  4421  4441  4481  4809  4811  4821  4841  4881
      330  4901  4402  4492  4802  4892  4912  4922  4012  4412  4422
      340  4442  4482  4812  4822  4842  4882  4902  4404  4408  4410
      350  4420  4804  4808  4810  4820  4924  4424  4444  4448  4484
      360  4488  4490  4824  4844  4848  4884  4888  4890  4904  4908
      370  4910  4920  4010  4008  4440  4480  4840  4880  4900  9109
      380  9111  9121  9209  9211  9221  9241  9009  9011  9021  9041
      390  9081  9101  9201  9249  9092  9112  9122  9212  9222  9242
ADDRESS    CONTENTS OF TABLE (HEXADECIMAL)
(DECIMAL)     0     1     2     3     4     5     6     7     8     9
      400  9012  9022  9042  9082  9102  9202  9004  9008  9010  9020
      410  8108  9224  8090  9248  9048  9084  9088  9090  9104  9108
      420  9110  9120  9204  9208  9210  9220  9040  9080  9100  9240
      430  8449  8489  8491  8909  8911  8921  8201  8011  8021  8041
      440  8401  9091  8801  9049  8081  8409  8411  8221  8241  8481
      450  8901  8209  8049  8441  8841  8089  8492  8912  8922  8012
      460  8022  8042  8402  9002  8802  8082  8412  8422  8442  8482
      470  8902  8212  8842  8092  8204  8208  8110  8120  8924  8220
      480  8044  8048  8420  8404  8210  8084  8088  8410  8824  8844
      490  8848  8884  8888  8890  8904  8908  8910  8920  8224  8248
      500  8010  8020  8424  8448  8490  8124  8840  8880  8900  8040
      510  8100  8080  8091  8122  9024  8024  8811  8881  8109  8211
      520  8809  8849  9089  8891  8101  8889  8249  8242  8882  8112
      530  8222  8892  8102  8202  8444  9044  8484  8488  8244  8104
      540  8820  8804  8808  8240  8480  8440
State    Thresholds (decimal)    Prefixes (binary)
         T1      T2              0 ≤ b < T1    T1 ≤ b < T2    T2 ≤ b ≤ 255
S0       000     001             -             01 or 00       00
S1       004     123             01            01 or 00       00
S2-5     034     050             10 or 01      01             01 or 00
S6-8     034     174             10 or 01      01             01 or 00

FIG. 6
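
The threshold table of FIG. 6 can be read as a lookup from (state, input byte) to one or two candidate prefixes. The Python sketch below assumes that reading, assumes the lower bound of each interval is inclusive, and assumes the table address is the prefix concatenated with the byte; the names `FIG6` and `candidate_addresses` are illustrative only. Where two prefixes are listed, the abstract indicates that the two resulting codewords have opposite parity of ones, which is what permits DC control.

```python
# Per state: (T1, T2, prefixes for b < T1, for T1 <= b < T2, for b >= T2).
# Values transcribed from FIG. 6; an empty tuple marks the blank cell.
FIG6 = {
    "S0":   (0,  1,   (),           (0b01, 0b00), (0b00,)),
    "S1":   (4,  123, (0b01,),      (0b01, 0b00), (0b00,)),
    "S2-5": (34, 50,  (0b10, 0b01), (0b01,),      (0b01, 0b00)),
    "S6-8": (34, 174, (0b10, 0b01), (0b01,),      (0b01, 0b00)),
}


def candidate_addresses(state: str, b: int) -> list[int]:
    """Return the table addresses that may encode byte b in the given state."""
    t1, t2, low, mid, high = FIG6[state]
    prefixes = low if b < t1 else mid if b < t2 else high
    return [(prefix << 8) | b for prefix in prefixes]


print(candidate_addresses("S2-5", 33))
```

A byte that yields two addresses offers the encoder a choice; a byte that yields one address does not.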



Last runlength of      l.s.b. of previous    Current state
previous codeword      input byte
0                      -                     S0
1                      -                     S1
2, 3, 4, 5, or 6       0                     S2-6a
2, 3, 4, 5, or 6       1                     S2-6b
7 or 8                 -                     S2-6b

FIG. 8
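
The state selection of FIG. 8 can be sketched in Python as follows. The sketch assumes that the "last runlength" is the number of zeros after the final one of the previous 16-bit codeword and that the least significant bit of the previous byte matters only for runlengths 2 to 6; both function names are hypothetical.

```python
def last_runlength(codeword: int) -> int:
    """Trailing zero count of a 16-bit codeword (assumed meaning of the term)."""
    if codeword == 0:
        return 16
    return (codeword & -codeword).bit_length() - 1


def current_state(last_run: int, prev_lsb: int) -> str:
    """Map the FIG. 8 inputs to a state label."""
    if last_run == 0:
        return "S0"
    if last_run == 1:
        return "S1"
    if 2 <= last_run <= 6:
        return "S2-6a" if prev_lsb == 0 else "S2-6b"
    if last_run in (7, 8):
        return "S2-6b"
    raise ValueError("runlength outside the range covered by FIG. 8")


print(current_state(last_runlength(0x0120), prev_lsb=1))
```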



State    Thresholds (decimal)    Prefixes (binary)
         T1      T2              0 ≤ b < T1    T1 ≤ b < T2    T2 ≤ b < 256
S0       000     000             -             -              00
S1       002     120             01            01 or 00       00
S2-6a    036     036             01            01 or 00       00
S2-6b    036     039             10            10 or 01       01

FIG. 9
ADDRESS    CONTENTS OF TABLE (HEXADECIMAL)
(DECIMAL)     0     1     2     3     4     5     6     7     8     9
        0  0004  0004  0008  0008  0009  0011  0010  0010  0012  0021
       10  0024  0024  0020  0020  0022  0041  0081  0049  0089  0091
       20  0042  0082  0092  0080  0044  0044  0084  0084  0048  0048
       30  0088  0088  0090  0090  0040  0040  0101  0102  0202  0201
       40  0249  0109  0111  0121  0209  0211  0221  0241  0112  0122
       50  0212  0222  0242  0100  0104  0104  0204  0204  0124  0124
       60  0224  0224  0244  0244  0108  0108  0208  0208  0248  0248
       70  0110  0110  0210  0210  0120  0120  0220  0220  0240  0240
       80  0401  0449  0489  0491  0409  0411  0421  0441  0481  0809
       90  0402  0492  0412  0422  0442  0482  0812  0480  0404  0404
      100  0424  0424  0444  0444  0484  0484  0408  0408  0448  0448
      110  0488  0488  0410  0410  0490  0490  0420  0420  0440  0440
      120  0801  0849  0889  0891  0909  0911  0921  1001  1049  1089
      130  1091  1109  1111  1121  1209  1211  1221  1241  0811  0821
      140  0841  0881  0901  1009  1011  1021  1041  1081  1101  1201
      150  ????  ????  ????  ????  ????  ????  ????  ????  ????  ????
      160  1222  1242  0822  0842  0882  0902  1012  1022  1042  1082
      170  1102  1202  0880  0900  1080  1100  0804  0804  0924  0924
      180  1004  1004  1124  1124  1224  1224  1244  1244  0824  0824
      190  0844  0844  0884  0884  0904  0904  1024  1024  1044  1044
      200  1084  1084  1104  1104  1204  1204  0808  0808  1008  1008
      210  1248  1248  0848  0848  0888  0888  0908  0908  1048  1048
      220  1088  1088  1108  1108  1208  1208  0810  0810  1010  1010
      230  0890  0890  0910  0910  1090  1090  1110  1110  1210  1210
      240  0820  0820  0840  0840  1020  1020  1040  1040  0920  0920
      250  1120  1120  1220  1220  1240  1240  2004  2004  2008  2008
      260  2009  2011  2010  2010  2012  2021  2024  2024  2020  2020
      270  2022  2041  2081  2001  2049  2089  2042  2082  2002  2080
      280  2044  2044  2084  2084  2048  2048  2088  2088  2090  2090
      290  2040  2040  2101  2102  2202  2201  2249  2091  2109  2111
      300  ????  2209  2211  2221  2092  2112  2122  2212  2222  2100
      310  2104  2104  2204  2204  2124  2124  2224  2224  2244  2244
      320  2108  2108  2208  2208  2248  2248  2110  2110  2210  2210
      330  2120  2120  2220  2220  2240  2240  2401  2449  2489  2491
      340  2241  2409  2411  2421  2441  2481  2402  2492  2242  2412
      350  2422  2442  2482  2480  2404  2404  2424  2424  2444  2444
      360  2484  2484  2408  2408  2448  2448  2488  2488  2410  2410
      370  2490  2490  2420  2420  2440  2440  4009  4011  4021  4041
      380  4081  4101  4201  4249  4401  4449  4489  4491  4801  4849
      390  4889  4891  4909  4911  4049  4089  4091  4109  4111  4121
      400  4209  4211  4221  4241  4409  4411  4421  4012  4022  4042

FIG. 7A
ADDRESS    CONTENTS OF TABLE (HEXADECIMAL)
(DECIMAL)     0     1     2     3     4     5     6     7     8     9
      410  4082  4102  4202  4402  4492  4802  4892  4912  4002  4092
      420  4112  4122  4212  4222  4242  4412  4422  4442  4480  4880
      430  4900  4080  4024  4024  4044  4044  4084  4084  4104  4104
      440  4204  4204  4404  4404  4004  4004  4124  4124  4224  4224
      450  4244  4244  4424  4424  4444  4444  4484  4484  4824  4824
      460  4844  4844  4048  4048  4088  4088  4108  4108  4008  4008
      470  4248  4248  4448  4448  4488  4488  4848  4848  4888  4888
      480  4908  4908  4090  4090  4110  4110  4210  4210  4010  4010
      490  4490  4490  4890  4890  4910  4910  4120  4120  4220  4220
      500  4240  4240  4420  4420  4440  4440  4020  4020  4040  4040
      510  4920  4920  4804  4804  4208  4208  4441  4481  4410  4410
      520  4482  4809  4884  4884  4820  4820  4812  4811  4901  4841
      530  4881  4921  4882  4902  4922  4100  4904  4904  4924  4924
      540  4408  4408  4808  4808  4810  4810  4840  4840  4821  4822
      550  4842
First runlength in     Second runlength in    Decoded bit
next codeword          next codeword
2 or more              -                      0
1                      5 or more              0
1                      2, 3, or 4             1
0                      -                      1

FIG. 10
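
FIG. 10 recovers one bit of the previous byte from the first two runlengths of the next codeword. A minimal Python sketch of that rule follows, assuming the four rows read (2 or more, any) gives 0, (1, 5 or more) gives 0, (1, 2 to 4) gives 1, and (0, any) gives 1; the function name `decoded_bit` is illustrative.

```python
def decoded_bit(first_run: int, second_run: int) -> int:
    """Recover one bit from the leading runlengths of the next codeword."""
    if first_run >= 2:
        return 0
    if first_run == 0:
        return 1
    # first_run == 1: the second runlength decides.
    if second_run >= 5:
        return 0
    if 2 <= second_run <= 4:
        return 1
    raise ValueError("runlength pair not covered by FIG. 10")


print(decoded_bit(1, 6), decoded_bit(1, 3))
```

Because the rule uses only runlengths of the received codewords, it needs no knowledge of the encoder state, which matches the state-independent decoding described in the abstract.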



Last runlength of      l.s.b. of previous    Current state
previous codeword      input byte
0                      -                     S0
1                      -                     S1
2                      0                     S2a
3                      0                     S3a
4                      0                     S4a
5 or 6                 0                     S5-6a
2, 3, 4, 5, or 6       1                     S2-6b
7, 8                   -                     S7-8

FIG. 11



State    Thresholds (decimal)    Prefixes (binary)
         T1      T2              0 ≤ b < T1    T1 ≤ b < T2    T2 ≤ b < 256
S0       000     000             -             -              00
S1       002     120             01            01 or 00       00
S2a      003     036             01            01 or 00       00
S3a      009     036             01            01 or 00       00
S4a      015     036             01            01 or 00       00
S5-6a    036     036             01            01 or 00       00
S2-6b    03?     039             10            10 or 01       01
S7-8     039     080             10 or 01      01             01 or 00

FIG. 12
State deleted    Redirection required to state
S2a              S3a, S4a, or S5-6a
S3a              S4a or S5-6a
S4a              S5-6a
S7-8             S2-6b

FIG. 13
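
The redirection table of FIG. 13 can be sketched in Python as a lookup that replaces a deleted state by a listed alternative. The sketch assumes that the alternatives are tried in the order printed and that the first one not itself deleted is used; the source does not state this ordering, the last row is read as S7-8, and the names `REDIRECTS` and `redirect` are hypothetical.

```python
# Alternatives per deleted state, transcribed from FIG. 13.
REDIRECTS = {
    "S2a": ["S3a", "S4a", "S5-6a"],
    "S3a": ["S4a", "S5-6a"],
    "S4a": ["S5-6a"],
    "S7-8": ["S2-6b"],
}


def redirect(state: str, deleted: set[str]) -> str:
    """Return the state to use in place of `state` when some states are deleted."""
    if state not in deleted:
        return state
    for candidate in REDIRECTS[state]:
        if candidate not in deleted:
            return candidate
    raise ValueError(f"no surviving redirection for {state}")


# With S2a, S3a, and S4a all deleted, S2a falls through to S5-6a.
print(redirect("S2a", {"S2a", "S3a", "S4a"}))
```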



Encoding level    States deleted             Percentage of bytes with DC control
Level-8           None                       12.2%
Level-7           S4a                        11.8%
Level-6           S3a and S4a                11.0%
Level-5           S2a, S3a, and S4a          9.7%
Level-4           S2a, S3a, S4a, and S7-8    7.6%

FIG. 14




FIG. 15 
  59   40   28   19   13    9    6    4    3    2    1    1    1
  87   59   40   28   19   13    9    6    4    3    2    1    1
 126   87   59   40   28   19   13    9    6    4    3    2    1
 125   86   59   40   27   19   13    9    6    4    3    2    1
 124   85   58   40   27   18   13    9    6    4    3    2    1
 122   84   57   39   27   18   12    9    6    4    3    2    1
 119   82   56   38   26   18   12    8    6    4    3    2    1
 115   79   54   37   25   17   12    8    5    4    3    2    1
 109   75   51   35   24   16   11    8    5    3    3    2    1
 100   69   47   32   22   15   10    7    5    3    2    2    1
  87   60   41   28   19   13    9    6    4    3    2    1    1
  68   47   32   22   15   10    7    5    3    2    2    1    0
  40   28   19   13    9    6    4    3    2    1    1    1    0

FIG. 16



  59   40   28   19   13   15    7
  87   59   40   28   19   13   10
 126   87   59   40   28   32   15
 125   86   59   40   27   32   15
 124   85   58   40   27   31   15
 119   82   56   38   26   30   14
 109   75   51   35   24   27   13



FIG. 17 
[Diagram of codeword intervals per state. State labels: S0 (256), S1 (374), S2a (287), S2b, S3a (283), S4a (277), S3b (259), S4b (259). Other labels: 412; Intervals 12, 11, 10. Interval boundaries (decimal addresses): 036, 080, 256, 292, 376. Interval captions: Interval 8-7; Interval of runlength 6-5; Interval of codewords starting with runlength 4-2; Interval of codewords starting with runlength 1; Interval of codewords starting with runlength 0.]

FIG. 18



State    Stationary probability    DC control
S0       0.2305                    0
S1       0.1563                    0.4609
S2a      0.1094                    0.1211
S3a      0.0742                    0.1055
S4a      0.0508                    0.0820
S5-6a    0.0586                    0